substitution pattern
From Unstructured Data to Demand Counterfactuals: Theory and Practice
Christensen, Timothy, Compiani, Giovanni
Empirical models of demand for differentiated products rely on low-dimensional product representations to capture substitution patterns. These representations are increasingly proxied by applying ML methods to high-dimensional, unstructured data, including product descriptions and images. When proxies fail to capture the true dimensions of differentiation that drive substitution, standard workflows will deliver biased counterfactuals and invalid inference. We develop a practical toolkit that corrects this bias and ensures valid inference for a broad class of counterfactuals. Our approach applies to market-level and/or individual data, requires minimal additional computation, is efficient, delivers simple formulas for standard errors, and accommodates data-dependent proxies, including embeddings from fine-tuned ML models. It can also be used with standard quantitative attributes when mismeasurement is a concern. In addition, we propose diagnostics to assess the adequacy of the proxy construction and dimension. The approach yields meaningful improvements in predicting counterfactual substitution in both simulations and an empirical application.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Wisconsin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Automobiles & Trucks (0.46)
- Education (0.46)
NestGNN: A Graph Neural Network Framework Generalizing the Nested Logit Model for Travel Mode Choice
Zhou, Yuqi, Cheng, Zhanhong, Hu, Lingqian, Bu, Yuheng, Wang, Shenhao
Nested logit (NL) has been commonly used for discrete choice analysis, including a wide range of applications such as travel mode choice, automobile ownership, or location decisions. However, the classical NL models are restricted by their limited representation capability and handcrafted utility specification. While researchers introduced deep neural networks (DNNs) to tackle such challenges, the existing DNNs cannot explicitly capture inter-alternative correlations in the discrete choice context. To address the challenges, this study proposes a novel concept, the alternative graph, to represent the relationships among travel mode alternatives. Using a nested alternative graph, this study further designs a nested-utility graph neural network (NestGNN) as a generalization of the classical NL model in the neural network family. Theoretically, NestGNNs generalize the classical NL models and existing DNNs in terms of model representation, while retaining the crucial two-layer substitution patterns of the NL models: proportional substitution within a nest but non-proportional substitution beyond a nest. Empirically, we find that the NestGNNs significantly outperform the benchmark models, particularly the corresponding NL models by 9.2%. As shown by elasticity tables and substitution visualization, NestGNNs retain the same two-layer substitution patterns as the NL model, yet present more flexibility in their model design space. Overall, our study demonstrates the power of NestGNN in prediction and interpretation, and its flexibility in generalizing the classical NL model for analyzing travel mode choice.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Texas (0.04)
- North America > United States > New York (0.04)
- Energy (0.93)
- Government > Regional Government > North America Government > United States Government (0.93)
- Transportation (0.74)
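The two-layer substitution pattern named in the NestGNN abstract above (proportional within a nest, non-proportional across nests) can be illustrated with the classical nested logit formula. This is a minimal sketch of the textbook NL model, not the NestGNN architecture; the mode names, utilities, and dissimilarity parameters are hypothetical values chosen only for illustration.

```python
import math

def nested_logit_probs(utilities, nests, lambdas):
    """Classical nested logit choice probabilities.

    utilities: dict alternative -> systematic utility V_j
    nests:     dict nest name -> list of alternatives in that nest
    lambdas:   dict nest name -> dissimilarity parameter in (0, 1]
    """
    # Inclusive value of each nest: log-sum of scaled utilities.
    inclusive = {
        k: math.log(sum(math.exp(utilities[j] / lambdas[k]) for j in alts))
        for k, alts in nests.items()
    }
    denom = sum(math.exp(lambdas[k] * inclusive[k]) for k in nests)
    probs = {}
    for k, alts in nests.items():
        p_nest = math.exp(lambdas[k] * inclusive[k]) / denom  # P(nest k)
        for j in alts:
            # P(j | nest k) is a plain logit inside the nest.
            p_within = math.exp(utilities[j] / lambdas[k] - inclusive[k])
            probs[j] = p_nest * p_within
    return probs

# Hypothetical travel modes and parameters (assumptions, not from the paper).
nests = {"transit": ["bus", "rail", "metro"], "private": ["car", "bike"]}
lambdas = {"transit": 0.5, "private": 0.8}
base = {"bus": 0.0, "rail": 0.2, "metro": 0.1, "car": 0.5, "bike": -0.3}

# Improve the bus and see how every other mode's probability responds.
improved = dict(base, bus=base["bus"] + 1.0)
p0 = nested_logit_probs(base, nests, lambdas)
p1 = nested_logit_probs(improved, nests, lambdas)
change = {j: p1[j] / p0[j] - 1.0 for j in base}

for mode in ("rail", "metro", "car", "bike"):
    print(f"{mode:>5}: {change[mode]:+.3%}")
```

In this sketch, rail and metro lose the same percentage of their probability (they share the bus's nest), car and bike also lose a common percentage, and the two percentages differ: the within-nest loss is larger because the dissimilarity parameter of the transit nest is below one.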
Graph neural networks for residential location choice: connection to classical logit models
Cheng, Zhanhong, Hu, Lingqian, Bu, Yuheng, Zhou, Yuqi, Wang, Shenhao
Researchers have adopted deep learning for classical discrete choice analysis as it can capture complex feature relationships and achieve higher predictive performance. However, the existing deep learning approaches cannot explicitly capture the relationship among choice alternatives, which has been a long-standing focus in classical discrete choice models. To address the gap, this paper introduces the graph neural network (GNN) as a novel framework to analyze residential location choice. The GNN-based discrete choice models (GNN-DCMs) offer a structured approach for neural networks to capture dependence among spatial alternatives, while maintaining clear connections to classical random utility theory. Theoretically, we demonstrate that the GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding a novel algorithmic interpretation through message passing among alternatives' utilities. Empirically, the GNN-DCMs outperform benchmark MNL, SCL, and feedforward neural networks in predicting residential location choices among Chicago's 77 community areas. Regarding model interpretation, the GNN-DCMs can capture individual heterogeneity and exhibit spatially aware substitution patterns. Overall, these results highlight the potential of GNN-DCMs as a unified and expressive framework for synergizing discrete choice modeling and deep learning in complex spatial choice contexts.
- North America > United States > Illinois > Cook County > Chicago (0.25)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
- Government > Regional Government > North America Government > United States Government (1.00)
- Banking & Finance (0.68)
- Energy (0.68)
- Transportation (0.67)
Demand Estimation with Text and Image Data
Compiani, Giovanni, Morozov, Ilya, Seiler, Stephan
We propose a demand estimation method that leverages unstructured text and image data to infer substitution patterns. Using pre-trained deep learning models, we extract embeddings from product images and textual descriptions and incorporate them into a random coefficients logit model. This approach enables researchers to estimate demand even when they lack data on product attributes or when consumers value hard-to-quantify attributes, such as visual design or functional benefits. Using data from a choice experiment, we show that our approach outperforms standard attribute-based models in counterfactual predictions of consumers' second choices. We also apply it across 40 product categories on Amazon and consistently find that text and image data help identify close substitutes within each category.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Law (1.00)
- Government > Regional Government > North America Government > United States Government (0.93)
- Leisure & Entertainment > Games > Computer Games (0.68)
- Retail > Online (0.68)
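The idea in the abstract above, that random coefficients on embedding-based characteristics let products that are close in embedding space act as closer substitutes, can be sketched with a toy second-choice calculation. This is an illustration under strong assumptions, not the authors' estimator: the one-dimensional "embedding" `x`, the mean utilities `delta`, and the two-point coefficient distribution (a stand-in for a continuous one) are all hypothetical.

```python
import math

def mixed_logit_shares(delta, x, betas, weights, available):
    """Shares from a logit with a discrete random coefficient on x.

    Each (beta, weight) pair is one consumer type; within a type,
    choice probabilities are plain logit over the available products.
    """
    shares = {j: 0.0 for j in available}
    for beta, w in zip(betas, weights):
        expu = {j: math.exp(delta[j] + beta * x[j]) for j in available}
        total = sum(expu.values())
        for j in available:
            shares[j] += w * expu[j] / total
    return shares

def diversion(delta, x, betas, weights, removed):
    """Fraction of the removed product's share that moves to each rival."""
    all_products = list(delta)
    remaining = [j for j in all_products if j != removed]
    before = mixed_logit_shares(delta, x, betas, weights, all_products)
    after = mixed_logit_shares(delta, x, betas, weights, remaining)
    return {k: (after[k] - before[k]) / before[removed] for k in remaining}

# Hypothetical 1-D embedding: A and B are close, C is far away.
x = {"A": 1.0, "B": 0.9, "C": -1.0}
delta = {"A": 0.0, "B": 0.0, "C": 0.0}

# Random coefficient: two equally likely consumer types (an assumption).
div_rc = diversion(delta, x, betas=[-2.0, 2.0], weights=[0.5, 0.5], removed="A")
# Plain logit benchmark: no heterogeneity in the taste for x.
div_logit = diversion(delta, x, betas=[0.0], weights=[1.0], removed="A")

print("random coefficients:", div_rc)
print("plain logit:        ", div_logit)
```

With heterogeneous tastes, most of A's buyers switch to the nearby product B when A is removed, whereas the plain logit benchmark splits them evenly between B and C regardless of embedding distance.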